One of the theoretical models proposed for the mechanism of gene unscrambling in some species of
ciliates is the template-guided recombination (TGR) system by Prescott, Ehrenfeucht and Rozenberg 
which has been generalized by Daley and McQuillan from a formal language theory perspective. In 
this paper, we propose a refinement of this model that generates regular languages using the iterated 
TGR system with a finite initial language and a finite set of templates, using fewer templates and a 
smaller alphabet compared to that of the Daley-McQuillan model. To achieve Turing completeness 
using only finite components, i.e., a finite initial language and a finite set of templates, we also 
propose an extension of the contextual template-guided recombination system (CTGR system) by 
Daley and McQuillan, by adding an extra control called permitting contexts on the usage of templates. 

1 Introduction 

This paper proposes improvements in the descriptional complexity of two theoretical models of gene un- 
scrambling in ciliates: template-guided recombination (TGR) systems, and contextual template-guided 
recombination (CTGR) systems. Ciliates are a group of unicellular eukaryotic protozoans, some of which 
have the distinctive characteristic of nuclear dualism, i.e., they have two types of nuclei: a functionally 
inert micronucleus and an active macronucleus. Genes in the active macronucleus provide RNA transcripts
for the maintenance of the structure and function of the cell. Genes within the micronucleus are
usually inactive and assist only in the conjugation process. The process of "decrypting" the micronu- 
clear genes after conjugation, to obtain the functional macronuclear genes, is called gene unscrambling 
or gene assembly. 

The genes within micronuclear chromosomes are composed of protein-coding DNA segments (also 
known as macronuclear destined sequences (MDSs)) interspersed by numerous, short, non-protein-coding 
DNA segments (also called internally eliminated sequences (IESs)). Furthermore, in some species of cil- 
iates such as Oxytricha or Stylonychia, the micronuclear gene has been found to have a highly complex 
structure in which MDSs are stored in a permuted order. During the course of macronuclear develop- 
ment, these IESs are eliminated from the micronucleus by means of homologous recombination, and 
the permuted MDSs are sorted, resulting in a functionally complete macronucleus with MDSs present 
in the correct order. In the micronuclear sequence, each MDS is flanked by guiding short sequences, 
3 to 20 nucleotides long, which act as pointers in a linked list. For instance, the $n$th MDS is flanked
on the left by the same short sequence which flanks the $(n-1)$th MDS on the right. In the process of
gene unscrambling, homologous recombination takes place between two DNA molecules that contain 
the identical guiding short sequences at the correct MDS-IES junctions. 

Various theoretical models have been proposed in order to model the genetic unscrambling processes
in ciliate organisms: the reversible guided recombination model, based on binary inter- and
intra-molecular DNA recombination operations; the ld, hi, dlad model, based on three unary
intra-molecular DNA recombination operations; the template-guided recombination (TGR) model,
where a DNA molecule from the old macronucleus conducts an inter-molecular DNA recombination process,
serving as a template; and the RNA-guided DNA assembly model [1], which has been experimentally confirmed
and in which either double-stranded RNA or single-stranded RNA acts as a template.

This paper proposes two improvements of the descriptional complexity (size of template language,
size of alphabet) of the template-guided recombination model as studied by Daley and McQuillan. In
[2], it was shown that the Daley-McQuillan TGR system can generate all regular languages using
iterated template-guided recombination with finite initial and template languages. For this model, the
gene rearrangement processes under consideration take place in a stochastic in vivo environment. In such a
biological setting, the size of the template language matters, because sufficient
copies of each template must be available throughout the recombination process. This is essential to
ensure that a template is available at the right place and at the proper time when it is needed.
Hence, the number of the unique templates should be as low as possible. The first aim of this paper 
is a reduction of the size of the template language by introducing a new approach to generate regular 
languages applying iterated TGR system with a finite initial language and a small finite set of templates. 

The second aim of this paper is the reduction of the size of the template language (from regular to
finite) in the extension of the template-guided recombination model called the contextual template-guided
recombination system (CTGR system). Recall that a CTGR system is a TGR system enhanced with "deletion
contexts", the introduction of which made it possible to raise the computational power of TGR to that of
Turing machines. Our reason for wanting to reduce the size of the template set is the obvious
one, namely that handling an infinite regular set of templates in a biological setting is impossible. To
achieve our goal, we employ an additional control over the templates in the form of "permitting
contexts". Namely, we introduce the contextual template-guided recombination system (CTGR system) using
permitting contexts as an extension of the CTGR system, and prove that an iterated version of this system
has the computational power of a Turing machine while using only a finite initial language and a finite set
of templates.

The paper is organized as follows. Section 2 introduces our new approach for generating the family
of regular languages by using iterated TGR systems with $n^2$ templates, compared to $n^3$ templates in [2].
This reduction in descriptional complexity is achieved at the expense of using a filtering set to discard
unintended results. Section 3 describes our proposed CTGR systems using permitting contexts that, unlike
CTGR systems, are able to characterize the recursively enumerable languages by using only a finite base
language and a finite set of templates. This reduction in the size of the template language is achieved by
introducing an additional control mechanism, the permitting context, to CTGR.

We end this introduction with some formal definitions and notations. An alphabet is a finite and
nonempty set of symbols. A word or a string is a finite sequence of symbols. Let $\Sigma$ be an alphabet. By
$\Sigma^*$ we denote the set of all words over $\Sigma$, including the empty word, denoted by $\lambda$. The set of nonempty
words over $\Sigma$, i.e., $\Sigma^* \setminus \{\lambda\}$, is denoted by $\Sigma^+$. The length of a word $x \in \Sigma^*$ is denoted by $|x|$.
For $k \in \mathbb{N}$, let $\Sigma^{\geq k} = \{w \mid w \in \Sigma^*, |w| \geq k\}$.

For two alphabets $\Sigma$ and $\Delta$, a morphism is a function $h : \Sigma^* \to \Delta^*$ satisfying $h(xy) = h(x)h(y)$ for all
$x, y \in \Sigma^*$. A morphism $h : \Sigma^* \to \Delta^*$ is called a coding if $h(a) \in \Delta$ for all $a \in \Sigma$, and a weak coding
if $h(a) \in \Delta \cup \{\lambda\}$ for all $a \in \Sigma$. We denote by RE, CS, CF, LIN, and REG the families of languages generated by
arbitrary, context-sensitive, context-free, linear, and regular grammars, respectively. By FIN we denote
the family of finite languages. For additional formal language theory definitions and notations, the reader
is referred to the standard literature.

2 TGR systems with fewer templates

This section proposes a refinement of the template-guided recombination (TGR) model studied by
Daley and McQuillan [2] that is able to generate the family of regular languages by using a
reduced number of templates and a smaller alphabet.

Definition 1. ([2]) A template-guided recombination system (or TGR system) is a four-tuple
$\varrho = (T, \Sigma, n_1, n_2)$, where $\Sigma$ is a finite alphabet, $T \subseteq \Sigma^*$ is the template language, $n_1$ is the
minimum MDS length and $n_2$ is the minimum pointer length.

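For a TGR system $\varrho = (T, \Sigma, n_1, n_2)$ and a language $L \subseteq \Sigma^*$, we define
$\varrho(L) = \{w \in \Sigma^* \mid (x, y) \vdash_t w \text{ for some } x, y \in L,\ t \in T\}$, where $(x, y) \vdash_t w$ if and only if
$x = u\alpha\beta d$, $y = e\beta\gamma v$, $t = \alpha\beta\gamma$, $w = u\alpha\beta\gamma v$, with $u, v, d, e \in \Sigma^*$, $\alpha, \gamma \in \Sigma^{\geq n_1}$ and
$\beta \in \Sigma^{\geq n_2}$. $L$ is sometimes called the base, or initial, language.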
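
The one-step operation can be made concrete with a small sketch. The Python below is an illustration only and not part of the formalism: it assumes that every symbol is a single character, and the names `occurrences` and `recombine` are hypothetical helpers introduced here. It enumerates every factorization $t = \alpha\beta\gamma$ permitted by $n_1$ and $n_2$, and every position at which $\alpha\beta$ occurs in $x$ and $\beta\gamma$ occurs in $y$.

```python
def occurrences(s, sub):
    """Yield the start index of every (possibly overlapping) occurrence of sub in s."""
    start = s.find(sub)
    while start != -1:
        yield start
        start = s.find(sub, start + 1)


def recombine(x, y, t, n1, n2):
    """Return every w with (x, y) |-_t w for minimum MDS length n1 and pointer length n2.

    Assumption: each symbol is one character, so words are plain strings.
    """
    results = set()
    # alpha = t[:i], beta = t[i:j], gamma = t[j:]
    for i in range(n1, len(t) - n1 - n2 + 1):        # |alpha| >= n1
        for j in range(i + n2, len(t) - n1 + 1):     # |beta| >= n2, |gamma| >= n1
            alpha_beta = t[:j]
            beta_gamma = t[i:]
            for p in occurrences(x, alpha_beta):     # x = u . alpha beta . d
                u = x[:p]
                for q in occurrences(y, beta_gamma): # y = e . beta gamma . v
                    v = y[q + len(beta_gamma):]
                    results.add(u + t + v)           # w = u alpha beta gamma v
    return results
```

For instance, with $n_1 = n_2 = 1$, `recombine("SaX", "XbY", "aXb", 1, 1)` returns the single word `SaXbY`: here $u = S$, $\alpha = a$, $\beta = X$, $\gamma = b$ and $v = Y$.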

Note that, if $x$ is a segment of the micronuclear DNA sequence that contains the $n$th MDS $\alpha$, and
$y$ is a segment of the micronuclear DNA sequence that contains the $(n+1)$st MDS $\gamma$, then the
recombination between $x$ and $y$ guided by the template $t$ brings the MDSs $n$ and $n+1$ into
the correct order in the intermediate DNA sequence $w$, regardless of their original position in the
micronuclear sequence. A sequence of such template-guided recombinations is thought to accomplish
gene unscrambling, that is, the transformation of the micronuclear DNA sequence into the macronuclear
DNA sequence in ciliates.

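For a TGR system $\varrho = (T, \Sigma, n_1, n_2)$ and a language $L \subseteq \Sigma^*$, $\varrho^*(L)$ is defined as follows:

$$\varrho^0(L) = L, \qquad \varrho^{n+1}(L) = \varrho^n(L) \cup \varrho(\varrho^n(L)),\ n \geq 0, \qquad \varrho^*(L) = \bigcup_{n=0}^{\infty} \varrho^n(L).$$

If $\mathcal{L}_1, \mathcal{L}_2$ are two language families, then
$\pitchfork^*(\mathcal{L}_1, \mathcal{L}_2, n_1, n_2) = \{\varrho^*(L) \mid L \in \mathcal{L}_1,\ \varrho = (T, \Sigma, n_1, n_2),\ T \in \mathcal{L}_2\}$ and
$\pitchfork^*(\mathcal{L}_1, \mathcal{L}_2) = \{\pitchfork^*(\mathcal{L}_1, \mathcal{L}_2, n_1, n_2) \mid n_1, n_2 \in \mathbb{N}\}$.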
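
The iteration $\varrho^{n+1}(L) = \varrho^n(L) \cup \varrho(\varrho^n(L))$ can be sketched as a simple loop. The code below is a hedged illustration: single-character symbols are assumed, the function names `recombine`, `step` and `iterate` are hypothetical, and, since $\varrho^*(L)$ may be infinite, only a fixed number of rounds is computed.

```python
from itertools import product


def recombine(x, y, t, n1, n2):
    """All w with (x, y) |-_t w (symbols are assumed to be single characters)."""
    out = set()
    for i in range(n1, len(t) - n1 - n2 + 1):
        for j in range(i + n2, len(t) - n1 + 1):
            alpha_beta, beta_gamma = t[:j], t[i:]
            p = x.find(alpha_beta)
            while p != -1:
                q = y.find(beta_gamma)
                while q != -1:
                    out.add(x[:p] + t + y[q + len(beta_gamma):])
                    q = y.find(beta_gamma, q + 1)
                p = x.find(alpha_beta, p + 1)
    return out


def step(language, templates, n1, n2):
    """rho(L): every one-step recombination of two words of L guided by a template."""
    result = set()
    for x, y, t in product(language, language, templates):
        result |= recombine(x, y, t, n1, n2)
    return result


def iterate(language, templates, n1, n2, rounds):
    """rho^rounds(L); stops early if a fixed point is reached."""
    current = set(language)
    for _ in range(rounds):
        nxt = current | step(current, templates, n1, n2)
        if nxt == current:
            break
        current = nxt
    return current
```

With the base language `{"SaX", "XbX"}` and the templates `{"aXb", "bXb"}`, one round adds `SaXbX` and `XbXbX`, and a second round adds, among others, `SaXbXbX`.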

In [2, Prop. 15], Daley and McQuillan prove that all regular languages can be generated using iterated
template-guided recombination systems with finite initial and template languages, i.e., every regular
language is a coding of a language in the family $\pitchfork^*(\mathrm{FIN}, \mathrm{FIN})$. The limitation of the Daley-McQuillan
model [2] is that the size of the template language and of the alphabet was not meant to be optimized. Since
the size of the template language has a great impact on this type of model during in vivo computation,
this is an important factor. Our aim is to reduce the number of templates. Namely, we introduce a
new approach to generate regular languages using iterated template-guided recombination, using a finite
initial language, a finite set of templates, and a weak coding. We provide a simpler construction than that
of [2, Prop. 15], with fewer templates and a smaller alphabet.

Proposition 1. Each regular language $L \subseteq \Sigma^*$ can be written in the form $L = h(\varrho^*(L_0) \cap R)$, where $R$
is a regular language, $h$ is a weak coding homomorphism, $\varrho = (T, \Sigma', 1, 1)$ is a TGR system, $T$ is a finite
set of templates and $L_0 \subseteq \Sigma'^*$ is a finite language.

Proof. Let $L \in \mathrm{REG}$ be generated by a regular grammar $G = (N, \Sigma, S, P)$ with the rules in $P$ of the
form $X \to aY$, $X \to a \mid \lambda$, for $X, Y \in N$, $a \in \Sigma$. We construct a TGR system $\varrho = (T, \Sigma', n_1, n_2)$, where
$n_1 = n_2 = 1$ and the alphabet is $\Sigma' = N \cup \Sigma \cup \{\#\}$. Here, $\#$ is a new symbol that acts as an end marker and
helps to complete the recombination process. We then construct a finite base language $L_0 \subseteq \Sigma'^*$ and
a template language $T \subseteq \Sigma'^*$ as follows.

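We define the finite base language by:

$L_1 = \{Sa\# \mid \exists\, S \to a \in P,\ a \in \Sigma\}$, $L_2 = \{SaX \mid \exists\, S \to aX \in P,\ a \in \Sigma\}$,
$L_3 = \{XbY \mid \exists\, X \to bY \in P,\ X, Y \in N,\ b \in \Sigma\}$,
$L_4 = \{XaX \mid \exists\, X \to aX \in P,\ X \in N,\ a \in \Sigma\}$, $L_5 = \{Xa\# \mid \exists\, X \to a \in P\}$,
$L_6 = \{X\#\# \mid \exists\, X \to \lambda \in P\}$, $L_0 = L_1 \cup L_2 \cup L_3 \cup L_4 \cup L_5 \cup L_6$.

The finite template language is defined by:

$T_1 = \{aXb \mid a, b, X, Y, Z \in \Sigma',\ \exists\, Y \to aX \in P, \text{ either } \exists\, X \to bZ \in P \text{ or } \exists\, X \to b \in P\}$,
$T_2 = \{aXa \mid a, X \in \Sigma',\ \exists\, X \to aX \in P\}$,
$T_3 = \{aX\# \mid a, X, Y, \# \in \Sigma',\ \exists\, Y \to aX \in P,\ \exists\, X \to \lambda \in P\}$, $T = T_1 \cup T_2 \cup T_3$.

Note that, for example, $L_4 \subseteq L_3$, $L_2 \subseteq L_3$, and $T_2 \subseteq T_1$. However, we keep these groups separate for
the sake of clarity in the proof.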

In order to eliminate all nonterminals and the new symbol, we consider the weak coding $h$ defined
by $h(X) = \lambda$ for any $X \in N$, $h(a) = a$ for any $a \in \Sigma$, and $h(\#) = \lambda$. Moreover, we consider the language
$R = \{S\}(\Sigma N)^*\{\#, \#\#\}$, whose purpose is to ensure that only strings of the correct form are accepted,
by removing other unintended strings.

We claim that $L = h(\varrho^*(L_0) \cap R)$.
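
The construction can be tried out on a small grammar. The following Python sketch is an illustration under several assumptions that are not part of the proof: symbols are single characters, the rule encoding and the names `build_system`, `closure` and `generate` are hypothetical, and the closure is cut off at a maximum word length because $\varrho^*(L_0)$ is infinite in general. The example grammar $S \to aX$, $X \to bX$, $X \to \lambda$ terminates only through a $\lambda$-rule, so every accepted word ends in $\#\#$.

```python
import re
from itertools import product


def starts(s, sub):
    """Start indices of every occurrence of sub in s."""
    return [i for i in range(len(s) - len(sub) + 1) if s.startswith(sub, i)]


def recombine(x, y, t):
    """One-step recombination for n1 = n2 = 1 and a three-symbol template t = alpha beta gamma."""
    alpha_beta, beta_gamma = t[:2], t[1:]
    return {x[:p] + t + y[q + 2:]
            for p in starts(x, alpha_beta) for q in starts(y, beta_gamma)}


def build_system(rules):
    """rules: (X, a, Y) for X -> aY, (X, a, None) for X -> a, (X, None, None) for X -> lambda."""
    base, templates = set(), set()
    for X, a, Y in rules:
        if a is None:
            base.add(X + "##")       # group L6
        elif Y is None:
            base.add(X + a + "#")    # groups L1 and L5
        else:
            base.add(X + a + Y)      # groups L2, L3 and L4
    for _, a, X in rules:
        if a is None or X is None:
            continue                 # templates need a rule Y -> aX
        for X2, b, _ in rules:
            if X2 != X:
                continue
            # aX# for X -> lambda (T3); aXb for X -> bZ or X -> b (T1, which contains T2)
            templates.add(a + X + ("#" if b is None else b))
    return base, templates


def closure(base, templates, max_len):
    """Recombine until no new word of length <= max_len appears (a cut-off, not the full rho*)."""
    words = set(base)
    while True:
        new = set()
        for x, y, t in product(words, words, templates):
            new |= {w for w in recombine(x, y, t) if len(w) <= max_len}
        if new <= words:
            return words
        words |= new


def generate(rules, terminals, nonterminals, start, max_len):
    """Apply the filter R = {S}(Sigma N)*{#, ##} and then the weak coding h."""
    base, templates = build_system(rules)
    R = re.compile(f"{start}([{terminals}][{nonterminals}])*(#|##)")
    h = lambda w: "".join(c for c in w if c in terminals)
    return {h(w) for w in closure(base, templates, max_len) if R.fullmatch(w)}


rules = [("S", "a", "X"), ("X", "b", "X"), ("X", None, None)]
words = generate(rules, "ab", "SX", "S", max_len=11)
```

With the length bound 11, the accepted strings are $SaX(bX)^k\#\#$ for $k \leq 3$, so `words` contains `a`, `ab`, `abb` and `abbb`, a finite portion of the language $ab^*$ generated by the grammar.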

For the "$\subseteq$" inclusion, in order to obtain a valid derivation in $G$ and to continue recombinations, we
take the string $SaX$ from group $L_2$ as the first string in the recombinations. At this stage, the
application of a rule of the form $X \to bY \in P$ can be simulated as follows. A string $XbY$ from group $L_3$,
taken as the second string, is recombined with the string $SaX$ using the appropriate template $aXb$ from
group $T_1$, producing the string $SaXbY$, which is of the form $\{S\}(\Sigma N)^*$, with $X, Y \in N$ and $a, b \in \Sigma$.
Rules of the form $X \to bY \in P$ are simulated using only templates from group $T_1$, because no other
template from groups $T_2$, $T_3$ can be used. The simulation is as follows:

$$(SaX, XbY) \vdash_{aXb} SaXbY.$$

The above simulation process can be repeated an arbitrary number of times using
the templates in group $T_1$. Likewise, rules of the form $X \to aX \in P$ can be simulated, producing the
string $SaXaX$, by using a template from $T_2$ and taking the second string from group $L_4$, as follows:

$$(SaX, XaX) \vdash_{aXa} SaXaX.$$

The application of rules of the form $X \to aX \in P$ can also be simulated repeatedly. In general, for
an intermediate recombination, if $u, v \in \varrho^*(L_0)$ represent derivations of $G$ and $u = u'aY$,
$v = Ybv'$, where $u' \in \{S\}(\Sigma N)^*$, $v' \in (N\Sigma)^*N$, $a \in \Sigma$, $b \in \Sigma$, then the recombination
$(u, v) \vdash_{aYb} u'aYbv'$ produces a string of the form $\{S\}(\Sigma N)^*$, which corresponds to an
intermediate computation of the form $S \Rightarrow^* \delta N$ in $G$, where $\delta \in \Sigma^*$.

If a template finds more than one matching point in the first string, then the template can attach to
any of those points, and a matching second string from $L_0$, as guided by the template, can be recombined
with the first string. For example, such a recombination can happen to a string of the form
$Sa_1X_1a_2X_2 \ldots a_iX_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_kX_k$.

Along this string, if $a_iX_i = a_kX_k$ for some $1 \leq i < k$, then the recombination guided by a template
$a_iX_ib$ can take place either at the matching position $a_iX_i$ or at the matching position $a_kX_k$, between
the above first string and a second string of the form $X_ibY$. This recombination produces the
resulting string either of the form $Sa_1X_1a_2X_2 \ldots a_iX_ibY$ or of the form
$Sa_1X_1a_2X_2 \ldots a_iX_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_kX_kbY$, respectively. Note, however, that any
recombination that does not happen at the rightmost end of the sentential form only has the effect of
"resetting" the derivation a few steps backward.
Thus, without loss of generality, we will hereafter assume that any derivation that results in a terminal
word has an equivalent rightmost derivation. We will only discuss these rightmost derivations.

Note also that recombinations can proceed in parallel. For example, a recombination can take place
between a string of the form $Sa_1X_1a_2X_2 \ldots a_iX_i$ and a string of the form $X_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_kX_k$
or, alternatively, $X_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_k\#$, using an appropriate template of the form $a_iX_ia_{i+1}$. This
leads to a resulting string of the form $Sa_1X_1a_2X_2 \ldots a_iX_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_kX_k$ or
$Sa_1X_1a_2X_2 \ldots a_iX_ia_{i+1}X_{i+1} \ldots a_{k-1}X_{k-1}a_k\#$, respectively. Any such derivation, however, can be
replaced by a derivation that starts from a word containing $S$ and proceeds unidirectionally towards a
terminal word.

Let us now examine the simulation of the terminating rules. Here, we assume that a string of the
form $Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1} \in \{S\}(\Sigma N)^*$, produced at the previous step, is taken as the first string.
The application of a rule of the form $X_{n-1} \to a_n \in P$ can then be simulated using
the second string $X_{n-1}a_n\#$ from group $L_5$ and the matching template $a_{n-1}X_{n-1}a_n$
from group $T_1$. After recombination, the produced string is $Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1}a_n\# = w'$, which
is of the form $\{S\}(\Sigma N)^*\{\#\}$ and corresponds to our intended terminal word. At this point, no further
recombination can take place at the rightmost end of this terminal string, because no matching template
can be found in the finite set of templates $T$ to guide recombination with this string:

$$(Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1},\ X_{n-1}a_n\#) \vdash_{a_{n-1}X_{n-1}a_n} Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1}a_n\#.$$

Moreover, for simulating a rule of the form $X_{n-1} \to \lambda$, the required second string is from group $L_6$
and the corresponding template is from group $T_3$. Recombination yields the string
$Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1}\#\#$ of the form $\{S\}(\Sigma N)^*\{\#\#\}$, which is the terminal string, and no further
recombination can take place:

$$(Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1},\ X_{n-1}\#\#) \vdash_{a_{n-1}X_{n-1}\#} Sa_1X_1a_2X_2 \ldots a_{n-1}X_{n-1}\#\#.$$

By construction, it is clear that each string in $\varrho^*(L_0)$ corresponds to a derivation in $G$, and the
simulation of a derivation is possible only by using recombinations according to the corresponding template
from the finite template language $T$. Accordingly, each derivation in $G$ of the form

$$S \Rightarrow a_1X_1 \Rightarrow \ldots \Rightarrow a_1a_2 \ldots a_kX_k \Rightarrow a_1a_2 \ldots a_ka_{k+1}X_{k+1} \Rightarrow \ldots \Rightarrow a_1a_2 \ldots a_{n-1}X_{n-1} \Rightarrow a_1a_2 \ldots a_n = w,$$

where $1 \leq k < n$, $X_k \to a_{k+1}X_{k+1} \in P$, $X_{n-1} \to a_n \in P$, corresponds to a computation in $\varrho$ of the form

$$(Sa_1X_1,\ X_1a_2X_2) \vdash_{a_1X_1a_2} Sa_1X_1a_2X_2 \ \ldots\ Sa_1X_1a_2X_2 \ldots a_kX_ka_{k+1}X_{k+1} \ \ldots$$

ending in $Sa_1X_1a_2X_2 \ldots a_kX_ka_{k+1}X_{k+1} \ldots a_{n-1}X_{n-1}a_n\# = w'$, or in
$Sa_1X_1a_2X_2 \ldots a_kX_ka_{k+1}X_{k+1} \ldots a_{n-1}X_{n-1}\#\# = w'$.
Therefore, a terminal string according to the grammar $G$
is achievable only by starting the recombination with a first string that begins with the start symbol $S$
and then proceeding by a series of recombinations, repeated an arbitrary number of times and guided by
the appropriate templates from the finite template language $T$, which ends with the end marker $\#$ and
simulates a derivation according to $G$. Afterwards, by intersecting the language $R = \{S\}(\Sigma N)^*\{\#, \#\#\}$ with
the set of generated strings, we obtain our intended terminal strings. In this way, we are able to find a string
$w' \in \varrho^*(L_0) \cap R$, and the application of the weak coding, $h(w') = w \in \Sigma^*$, then yields the
exact string generated by a derivation in $G$. Thus, every derivation in $G$ can be simulated.

Hence, we obtain $L \subseteq h(\varrho^*(L_0) \cap R)$. The other inclusion follows because the only recombinations
that can happen according to $\varrho$ lead either to words that are eliminated by the filter, or to words in $L$
after applying the weak coding. □

Let us now compare the size of the template language we obtained with that of the Daley-McQuillan
model [2]. The Daley-McQuillan model [2] requires three production rules to construct a template, based
on the following definition of the template language: $T = \{[X, a, Y][Y, b, Z][Z, c, W]\}$, where
$[X, a, Y], [Y, b, Z], [Z, c, W] \in V$, $V$ is an alphabet and $X \to aY$, $Y \to bZ$, $Z \to cW \in P$. If the number
of production rules in the grammar is $|P| = n$, then based on this definition the template language has a
cardinality of $n^3$.

Our construction requires two production rules to construct a template. In the worst case we can
have $n^2$ templates, where $n$ is the number of production rules in the simulated grammar. In addition to
the number of templates, the size of the TGR alphabet $\Sigma'$ in our construction is small: one plus the number
of terminals and nonterminals in the simulated grammar. In the Daley-McQuillan model as described above,
the alphabet $V$ can be much larger, and it also depends on the number of productions of the grammar.
Although we require fewer templates and a smaller alphabet, our model has one limitation: it requires a filter
to discard unintended results, while the Daley-McQuillan model requires only the correct recombination
to occur according to the constructed matching templates.



3 CTGR systems with permitting contexts 

As shown in earlier work, the finiteness of the initial language and the set of templates restricts the
computational power of a TGR system. In fact, even with a regular initial language and a regular set of
templates, iterated TGR systems can generate at most regular languages.

Daley and McQuillan added a new feature called "deletion context" to enhance the computational
power of template-guided recombination. Their extension of the TGR system is called the
contextual template-guided recombination system (CTGR system). They showed that arbitrary
recursively enumerable languages can be generated by iterated CTGR with a regular set of templates and
a finite initial language, with the help of taking intersection with the Kleene star of the terminal alphabet,
and a coding. From a practical viewpoint, dealing with a regular set of templates is not realistic in the
sense that we cannot manage an infinite "computer".

To achieve the finiteness of the employed component sets while preserving the computational power 
of CTGR, we impose an additional control on the templates in order to restrict their usage. More pre- 
cisely, we associate each template with a set of "permitting contexts": strings that must appear as sub- 
words within the two participating words if this particular template is to be used for their recombination. 
The idea of permitting contexts has been previously used in the context of splicing systems, a formal 
model of DNA recombination that uses restriction enzymes and ligases [12]. 

Definition 2. A contextual template-guided recombination system (CTGR system) using permitting
contexts is a quadruple $\varrho_p = (T, \Sigma, n_1, n_2)$, where $\Sigma$ is a finite alphabet, $n_1 \in \mathbb{N}$ is the minimum MDS length
and $n_2 \in \mathbb{N}$ is the minimum pointer length, and $T$ is a set of triples (templates using permitting contexts) of
the form $t_p = (t; C_1, C_2)$, with $t = e_1\#\alpha\beta\gamma\#d_1$ being a template over $\Sigma$ and $C_1, C_2$ being finite subsets of
$\Sigma^*$. To such a triple $t_p$ we associate the word

$$r(t_p) = e_1\#\alpha\beta\gamma\#d_1\$a_1\&\ldots\&a_k\$b_1\&\ldots\&b_m,$$

where $C_1 = \{a_1, \ldots, a_k\}$, $C_2 = \{b_1, \ldots, b_m\}$, $k, m \geq 0$ and $\$, \&, \#$ are new special symbols not included
in $\Sigma$. We define $r(T) = \{r(t_p) \mid t_p \in T\}$.

For a CTGR system using permitting contexts $\varrho_p = (T, \Sigma, n_1, n_2)$ and a language $L \subseteq \Sigma^*$, we define
$\varrho_p(L) = \{w \in \Sigma^* \mid (x, y) \vdash_{t_p} w \text{ for some } x, y \in L \text{ and } t_p \in T\}$, where $(x, y) \vdash_{t_p} w$ if and only if
$x = u\alpha\beta d_1d$, $y = ee_1\beta\gamma v$, $t_p = (e_1\#\alpha\beta\gamma\#d_1; \{a_1, \ldots, a_k\}, \{b_1, \ldots, b_m\})$, $w = u\alpha\beta\gamma v$, with
$u, v, d, e \in \Sigma^*$, $\alpha, \gamma \in \Sigma^{\geq n_1}$ and $\beta \in \Sigma^{\geq n_2}$. Every element of $C_1$ must appear as a substring in $x$ and every
element of $C_2$ must appear as a substring in $y$, i.e., $a_i \in \mathrm{sub}(x)$ for $1 \leq i \leq k$ and $b_j \in \mathrm{sub}(y)$ for $1 \leq j \leq m$;
moreover, if $C_1 = \{\lambda\}$ or $C_2 = \{\lambda\}$, then no constraint is imposed on $x$ or $y$, respectively.
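
A one-step recombination with permitting contexts can be sketched as follows. This Python fragment is illustrative only: words are modelled as tuples of symbol names so that symbols such as `B1` or `Y_C` can be used, a template is encoded as the hypothetical 5-tuple `(e1, core, d1, C1, C2)` with `core` standing for $\alpha\beta\gamma$, and the empty tuple plays the role of $\lambda$.

```python
def find_all(word, sub):
    """Start indices of every occurrence of the tuple sub inside the tuple word."""
    n, m = len(word), len(sub)
    return [i for i in range(n - m + 1) if word[i:i + m] == sub]


def ctgr_step(x, y, tp, n1=1, n2=1):
    """All w with (x, y) |-_{t_p} w for a template t_p = (e1 # alpha beta gamma # d1; C1, C2)."""
    e1, core, d1, c1, c2 = tp
    # Permitting contexts: every element of C1 must occur in x, every element of C2 in y.
    # The empty tuple (lambda) occurs in every word, so it imposes no constraint.
    if any(not find_all(x, c) for c in c1) or any(not find_all(y, c) for c in c2):
        return set()
    out = set()
    for i in range(n1, len(core) - n1 - n2 + 1):       # alpha = core[:i]
        for j in range(i + n2, len(core) - n1 + 1):    # beta = core[i:j], gamma = core[j:]
            x_pattern = core[:j] + d1                  # alpha beta d1 must occur in x
            y_pattern = e1 + core[i:]                  # e1 beta gamma must occur in y
            for p in find_all(x, x_pattern):
                for q in find_all(y, y_pattern):
                    u = x[:p]
                    v = y[q + len(y_pattern):]
                    out.add(u + core + v)              # w = u alpha beta gamma v
    return out


# Example: the deletion context d1 = (C, Y) is removed and Y_C is attached instead.
template = (("Z",), ("B2", "A", "Y_C"), ("C", "Y"), [("X",)], [()])
x = ("X", "B", "B1", "B2", "A", "C", "Y")
y = ("Z", "A", "Y_C")
result = ctgr_step(x, y, template)
```

Here `result` contains the single word `("X", "B", "B1", "B2", "A", "Y_C")`. If the first word does not contain the permitting context `X`, the same template yields no result.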

For a CTGR system using permitting contexts $\varrho_p = (T, \Sigma, n_1, n_2)$ and a language $L \subseteq \Sigma^*$, we can
define an iterated version $\varrho_p^*(L)$ similarly as for TGR systems.

The following proposition shows that iterated CTGR systems using permitting contexts can generate
arbitrary recursively enumerable languages using a finite initial language and a finite set of templates,
with the help of intersection with a filter language followed by a weak coding homomorphism.

Proposition 2. Every recursively enumerable language $L \subseteq \Sigma^*$ can be written in the form
$L = h(L' \cap L_1)$, where $h$ is a weak coding homomorphism, $L_1$ is a regular language and $L' = \varrho_p^*(L_0)$ with $L_0$ a
finite language.

Proof. Consider a Chomsky type-0 grammar $G = (N, \Sigma, S, P)$ in Kuroda normal form, where $L(G) = L$
and the production rules in $P$ are of the forms $A \to EC$, $AE \to CD$, $A \to a \mid \lambda$ for $A, C, D, E \in N$,
$a \in \Sigma$. Let us denote $U = N \cup \Sigma \cup \{B, B_1, B_2\}$, where $B, B_1, B_2$ are new symbols.

We then construct a CTGR system using permitting contexts $\varrho_p = (T, V, 1, 1)$,
where $V = N \cup \Sigma \cup \{B, B_1, B_2, X, X', Y, Z, Z'\} \cup \{Y_b \mid b \in U\}$
and $T$ contains the following templates using permitting contexts:

Simulate : 1. $Z\#cavY\#uY; \{X\}, \{\lambda\}$, for $a, c \in U$, $Z, Y \in V$, $u \to v \in P$,
Rotate : 2. $Z\#caY_b\#bY; \{X\}, \{\lambda\}$, for $a, b, c \in U$, $Z, Y \in V$,
3. $X\#X'bde\#Z; \{\lambda\}, \{Y_b\}$, for $b, d, e \in U$, $Z, X \in V$,
4. $Z\#caY\#Y_b; \{X'\}, \{\lambda\}$, for $Z, Y, Y_b \in V$,
5. $X'\#Xac\#Z; \{\lambda\}, \{Y\}$, for $X', X, Z \in V$,
Terminate : 6. $XBB_1B_2\#abc\#Z'; \{\lambda\}, \{Y\}$, for $X, Z' \in V$, $B, B_1, B_2 \in U$.

We define the following languages, which are included in the finite initial language $L_0 \subseteq V^*$:
$L_1 = \{XBB_1B_2SY\}$, $L_2 = \{ZavY \mid a \in U,\ u \to v \in P\}$,
$L_3 = \{ZaY_b \mid a, b \in U\}$, $L_4 = \{X'baZ \mid b, a \in U\}$,
$L_5 = \{ZaY \mid a \in U\}$, $L_6 = \{XaZ \mid a \in U\}$, $L_7 = \{abZ' \mid a, b \in U\}$.
We denote $L_0 = L_1 \cup L_2 \cup L_3 \cup L_4 \cup L_5 \cup L_6 \cup L_7$, and $L_0$ acts as the initial language.

For the construction of this system, we use the well-known "rotate-and-simulate" proof technique,
which has been used effectively in other contexts [12]. It allows the simulation
of a rule that applies to a symbol in the middle of the word by first moving that symbol to the right-hand
end of the word, simulating the rule, and returning the result to its original place.

Throughout this construction we assume $x$ and $y$ to be, respectively, the first word and the second
word of the recombination as defined in Definition 2.

To start the simulation of the derivation steps in $G$, the word
$XBB_1B_2SY$ must be taken as the first word. Indeed, any other choice of start word leads to derivations of
words of illegal form (not in $XBB_1B_2\Sigma^*Y$). Throughout the derivation steps this word is bordered by $X$ or its
variant $X'$ at the left end, and by $Y$ or its variant $Y_b$, $b \in U$, at the right end; $X, X'$ and $Y, Y_b$ mark
the left and right extremity of the word, respectively. Likewise, the symbol $B$ always signals the beginning
of the word, i.e., of the sentential form of $G$, which facilitates the permutation of the word, and $B_1, B_2$ are
included to provide the contexts for recombination.

Note that all the templates with permitting contexts in $T$ include the symbol $Z$ or $Z'$, which thus
has to be present in one of the two words taking part in the recombination. Furthermore, the words
containing the symbols $Z$ and $Z'$ are from the initial language $L_0$, and these symbols do not appear in the
word resulting from the recombination. This guarantees that each recombination has to happen between the
current word, produced in the previous recombination, and at least one word from $L_0$. The simulation of a
derivation in $G$ starts with the application of template 1 to $XBB_1B_2SY \in L_1$ and $ZavY \in L_2$, where initially
$w = BB_1B_2S$, $S \to v \in P$ and $a \in U$. The word obtained through the recombination is $XBB_1B_2vY$:

$$(XBB_1B_2SY,\ ZB_2vY) \vdash_{t_p} XBB_1B_2vY$$

for $t_p = (Z\#cavY\#uY; \{X\}, \{\lambda\}) = (Z\#B_1B_2vY\#SY; \{X\}, \{\lambda\})$, where $S \to v \in P$, $c, a \in U$ and
$w = BB_1B_2S$.

In general, given a word $Xx_1BB_1B_2x_2uY$ and a rule $u \to v \in P$, applying the associated template from
group 1 produces the word $Xx_1BB_1B_2x_2vY$. Here, $w = w_1cau = x_1BB_1B_2x_2u$.
This simulates a derivation step $x_2ux_1 \Rightarrow x_2vx_1$ in $G$. The recombination is as follows:

$$(Xw_1cauY,\ ZavY) \vdash_{t_p} Xw_1cavY$$

for $t_p = (Z\#cavY\#uY; \{X\}, \{\lambda\})$, where $u \to v \in P$, $c, a \in U$.

In this simulation step, only templates from group 1 can be applied; the templates from groups 2-6
are excluded by the restrictions that the deletion contexts and permitting contexts impose on the usage of
the templates. Afterwards, we come to the rotation process, which is needed to move symbols from the
right-hand end of the current word to the left-hand end. This rotation process consists of the
following steps:

Step 1: We start the rotation process using the corresponding template from group 2 with a
word $XwbY$, where $b \in N \cup \Sigma$, $w \in (N \cup \Sigma)^*\{BB_1B_2\}(N \cup \Sigma)^*$ (respectively $XwbY$, where
$b \in \{B, B_1, B_2\}$, $w \in (N \cup \Sigma)^*$). In this step, $x = XwbY = Xw_1cabY$, $y = ZaY_b \in L_3$:

$$(Xw_1cabY,\ ZaY_b) \vdash_{t_p} Xw_1caY_b$$

for $t_p = (Z\#caY_b\#bY; \{X\}, \{\lambda\})$, where $wb \in (N \cup \Sigma)^*\{BB_1B_2\}(N \cup \Sigma)^*$, $b \in N \cup \Sigma \cup \{B, B_1, B_2\}$.

Step 2: After applying the template from group 2 in Step 1, we obtain the word $Xw_1caY_b$, which
we rewrite as $Xdew_2Y_b$, where $w = w_1ca = dew_2$. We then continue the rotation process using the
matching template from group 3 with $x = X'bdZ \in L_4$, $y = Xdew_2Y_b$:

$$(X'bdZ,\ Xdew_2Y_b) \vdash_{t_p} X'bdew_2Y_b$$

for $t_p = (X\#X'bde\#Z; \{\lambda\}, \{Y_b\})$, where $b \in N \cup \Sigma \cup \{B, B_1, B_2\}$.

Step 3: The resulting word from the previous step is of the form $X'bdew_2Y_b$, which can be written
in the form $X'wY_b = X'w_3caY_b$. In this step we apply the matching template from group 4, where
$x = X'w_3caY_b$, $y = ZaY \in L_5$:

$$(X'w_3caY_b,\ ZaY) \vdash_{t_p} X'w_3caY$$

for $t_p = (Z\#caY\#Y_b; \{X'\}, \{\lambda\})$, where $c, a \in N \cup \Sigma \cup \{B, B_1, B_2\}$.

Step 4: The recombined word from Step 3 is $X'w_3caY$; in general, the outcome of Step 3 is a word
of the form $X'acw_4Y$. Lastly, we complete the rotation process by using a template from group 5, where
$x = XaZ \in L_6$, $y = X'acw_4Y$:

$$(XaZ,\ X'acw_4Y) \vdash_{t_p} Xacw_4Y$$

for $t_p = (X'\#Xac\#Z; \{\lambda\}, \{Y\})$, where $a, c \in N \cup \Sigma \cup \{B, B_1, B_2\}$.

The above rotation steps produce the word $Xbacw_1Y = XbwY$. Hence, starting
from the word $XwbY$ and applying Steps 1-4, we obtain the word $XbwY$, which has the same end
markers. In this way, we are able to move the symbol $b$ from the right-hand end to the left-hand end of
the word, which accomplishes the rotation of the underlying sentential form. These rotation steps can be
repeated an arbitrary number of times and thus provide every circular permutation of the word flanked
by $X$ and $Y$.
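
The four rotation steps can be traced on a concrete word. The sketch below is a hedged illustration rather than part of the proof: words are tuples of symbol names, templates are encoded as hypothetical 5-tuples `(e1, core, d1, C1, C2)`, the name `rotate_last` is introduced here, and the code assumes that each step has exactly one possible outcome, which holds for the example word but not necessarily for every word.

```python
def find_all(word, sub):
    """Start indices of every occurrence of the tuple sub inside the tuple word."""
    n, m = len(word), len(sub)
    return [i for i in range(n - m + 1) if word[i:i + m] == sub]


def ctgr_step(x, y, tp, n1=1, n2=1):
    """All w with (x, y) |-_{t_p} w; the empty tuple stands for lambda."""
    e1, core, d1, c1, c2 = tp
    if any(not find_all(x, c) for c in c1) or any(not find_all(y, c) for c in c2):
        return set()
    out = set()
    for i in range(n1, len(core) - n1 - n2 + 1):
        for j in range(i + n2, len(core) - n1 + 1):
            y_pattern = e1 + core[i:]
            for p in find_all(x, core[:j] + d1):
                for q in find_all(y, y_pattern):
                    out.add(x[:p] + core + y[q + len(y_pattern):])
    return out


def rotate_last(word):
    """Turn X w b Y into X b w Y using one template from each of the groups 2-5."""
    *w, b = word[1:-1]
    c, a = w[-2], w[-1]          # last two symbols of w
    d, e = w[0], w[1]            # first two symbols of w
    Yb, lam = "Y_" + b, ()
    # Step 1 (group 2): replace the suffix b Y by Y_b.
    t2 = (("Z",), (c, a, Yb), (b, "Y"), [("X",)], [lam])
    (s1,) = ctgr_step(word, ("Z", a, Yb), t2)
    # Step 2 (group 3): replace the prefix X by X' b.
    t3 = (("X",), ("X'", b, d, e), ("Z",), [lam], [(Yb,)])
    (s2,) = ctgr_step(("X'", b, d, "Z"), s1, t3)
    # Step 3 (group 4): restore the right marker Y.
    t4 = (("Z",), (c, a, "Y"), (Yb,), [("X'",)], [lam])
    (s3,) = ctgr_step(s2, ("Z", a, "Y"), t4)
    # Step 4 (group 5): restore the left marker X.
    t5 = (("X'",), ("X", b, d), ("Z",), [lam], [("Y",)])
    (s4,) = ctgr_step(("X", b, "Z"), s3, t5)
    return s4


rotated = rotate_last(("X", "B", "B1", "B2", "A", "C", "Y"))
```

For the word $XBB_1B_2ACY$ the intermediate words are $XBB_1B_2AY_C$, $X'CBB_1B_2AY_C$ and $X'CBB_1B_2AY$, and `rotated` is $XCBB_1B_2AY$: the symbol $C$ has moved from the right-hand end to the left-hand end.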

By applying a template from group 1 to a word of the form $XwY$, where $w$ ends with the left-hand side of
a rule in $P$, it is possible to simulate the application of any rule of $P$ at any desired position of
the sentential form of $G$, by means of the four rotation steps.

Observe that, starting from the initial word $XBB_1B_2SY$, no word produced at any step
includes the symbols $Z$, $Z'$; that is, each word is of the form $\alpha_1x_1BB_1B_2x_2\alpha_2$, in which the pair $(\alpha_1, \alpha_2)$
is one of the four pairs $(X, Y)$, $(X, Y_b)$, $(X', Y_b)$, $(X', Y)$, $b \in U$. In fact, these symbols, being present in
the templates of $T$, serve as permitting contexts that regulate the recombination process
of the system $\varrho_p$.

Now we come to the termination process. By applying the terminating template from group 6, we can
remove $XBB_1B_2$ only when $Y$ is present and the symbols $B, B_1, B_2$, together as the word $BB_1B_2$, are
adjacent to $X$. Here, $x = abZ' \in L_7$, $y = Xacw_4Y = XBB_1B_2bcw_5Y$:

$$(abZ',\ XBB_1B_2bcw_5Y) \vdash_{t_p} abcw_5Y$$

for $t_p = (XBB_1B_2\#abc\#Z'; \{\lambda\}, \{Y\})$, where $w \in (N \cup \Sigma)^*\{BB_1B_2\}(N \cup \Sigma)^*$,
$b, a, c \in N \cup \Sigma \cup \{B, B_1, B_2\}$.

The word obtained is of the form $abcw_5Y = wY = w' \in \varrho_p^*(L_0)$, where $L' = \varrho_p^*(L_0)$. The
intersection with the language $L_1 = \Sigma^*Y$ filters out the words that are not in the proper form.
Furthermore, we define a weak coding homomorphism that eliminates the right end marker $Y$ and leaves
the other letters unchanged: $h(a) = a$ for any $a \in \Sigma$, and $h(Y) = \lambda$.

Thus, by applying the weak coding homomorphism, we obtain a word in $\Sigma^*$, where $w \in h(L' \cap L_1)$.
Finally, from the above construction we can produce each word in $L(G)$, that is, $L(G) \subseteq
h(\varrho_p^*(L_0) \cap \Sigma^*Y)$. Conversely, the opposite inclusion also holds for this system, i.e.,
$h(\varrho_p^*(L_0) \cap \Sigma^*Y) \subseteq L(G)$. □

4 Conclusions 

This paper improves on the descriptional complexity (size of the template language) from $n^3$ to $n^2$ in the
case of template-guided recombination (TGR) systems, and from regular to finite in the case of contextual
template-guided recombination (CTGR) systems. These reductions are obtained at the expense of using
a filtering language in the case of TGR, and of an additional control (permitting contexts) in the case of
CTGR.



L. Kari, A. Rahman